System and method for producing data and ECC code words using a high rate restricted-symbol code

ABSTRACT

An encoding system manipulates L m-bit data symbols or sequences in accordance with a “restricted-symbol” code to produce code words that include error correction code (ECC) redundancy information and also meet modulation requirements, such as run length. The system combines the data and associated redundancy information of a code word D of the underlying code and one or more predetermined symbols or sequences that are appended to the data code word with the corresponding symbols or bit sequences of a selected code word F, to produce a transmission code word C that consists of symbols or sequences that meet the modulation requirements. Thereafter, the system corrects any errors in the retrieved or received code word C using the included redundancy information and the L m-bit data symbols or sequences are then recovered by removing therefrom the contributions of the code word F. The system may instead use the restricted-symbol code strictly as a data code, by combining the respective m-bit data symbols or sequences and one or more predetermined symbols with one or more selected m-bit symbols or sequences, to produce L+1 m-bit symbols or sequences that meet the modulation requirements. The predetermined symbols or sequences are appended to the data to aid in decoding, with the corresponding symbols in the encoded code word or data symbols or sequences indicating to a decoder which selected code word, symbols or sequences have been combined with the data.

CROSS-REFERENCE TO RELATED APPLICATION

The present application claims the benefit of U.S. Provisional patent application Ser. No. 60/446,212, which was filed on Feb. 10, 2003, by Lih Weng for MODULATED LINEAR CODES AND REED-SOLOMON CODES WITH RESTRICTED SYMBOLS and is hereby incorporated by reference.

BACKGROUND OF THE INVENTION

1. Field of the Invention

This invention relates generally to data processing systems and, in particular, to systems that encode data for error correction and modulation purposes.

2. Background Information

Before data are transmitted over a communications channel to a receiver or a data storage device, the data are typically encoded twice, once to allow error detection and/or correction and then again for signal modulation purposes. The error correction/detection encoding manipulates the data in accordance with a distance d error correction code (ECC), to produce data code words that include the data and associated redundancy information. The modulation encoding encodes sequences of data and redundancy information into longer modulation code sequences that meet desired run lengths and so forth. The modulation encoding promotes recovery of the respective bits and/or symbols that comprise the data and redundancy information from the transmitted or stored signals essentially by ensuring that transitions between various signal levels occur at least once every predetermined number of bits.

To decode the data and associated redundancy information from received or retrieved signals, the decoder uses the modulation code to recover the bit sequences. The system then groups the bits into symbols or sequences of appropriate lengths, to reproduce the data code words. The system next decodes the data code words using the ECC to produce, if possible, error-free data.

The modulation code decoding includes in the decoded bit sequences errors induced by the communications channel, over which the bit sequences are transmitted to and from storage media or to a receiver. The decoding itself also introduces further errors associated with the misinterpretation of the bits of the various modulation code sequences. The errors introduced into the bit sequences by the decoding process are commonly referred to as “propagation errors.” The propagation errors may affect multiple symbols that are included in the same or in multiple data code words, which may, in turn, result in uncorrectable errors in the data. Accordingly, more powerful error correction is required to protect against the propagation errors. The system must thus include more redundancy in the transmitted or stored information, and consequently fewer data symbols may be transmitted within a given time and/or stored within a given space. Further, the system must be made more complex to operate with the more powerful error correction codes and/or techniques.

SUMMARY OF THE INVENTION

The invention is a system for manipulating data in accordance with a rate L/(L+1) “restricted-symbol” linear code to produce code words that include error correction code (ECC) redundancy information and also meet modulation requirements, such as run length. The system eliminates the need for a separate data modulation code, and thus, eliminates the source of the propagation errors. For ease of understanding, the invention is explained in terms of linear codes that are based on multiple bit symbols. However, as discussed further herein, the invention also includes linear codes that are based on bit sequences and/or non-binary linear codes.

Basically, the current system produces a transmission code word C with m-bit symbols or bit sequences that meet the modulation requirements by combining the data and associated redundancy information of a code word D of the underlying linear code with the corresponding symbols or bit sequences of a selected code word F. The transmission code word C thus consists of only the non-prohibited symbols, that is, of symbols that are not prohibited by the modulation rules. Thereafter, the retrieved or received code word C is decoded using the linear code to first correct any errors introduced by the communications channel, and then the data are recovered by removing therefrom the contributions of the code word F.

As discussed below, the system may instead be used to manipulate data that is not part of an ECC code word. The system may thus use the linear code as strictly a data code, that is, encode data sequences or symbols to meet the modulation requirements.

BRIEF DESCRIPTION OF THE DRAWINGS

The invention description below refers to the accompanying drawings, of which:

FIG. 1 is a functional block diagram of a system constructed in accordance with the invention;

FIG. 2 is a functional block diagram of an alternative system constructed in accordance with the invention; and

FIG. 3 illustrates various matrices associated with the operations of the system.

DETAILED DESCRIPTION OF AN ILLUSTRATIVE EMBODIMENT

Before discussing the system operations in general, the system and its operations are discussed by way of examples using binary codes. The same techniques may also be readily used, with some modification, with non-binary codes.

In Section A, the system and its operations are described using as an example a data code that combines data and selected symbols in bit-by-bit XOR operations. In Sections B and C the system and its operations are described using codes over GF(2^(m)), with Section C describing the use of a Reed Solomon error correction code. Section D describes in more detail certain operations of a general system that operates in accordance with the system discussed in Section B.

The examples included herein are based on modulation requirements that prohibit m-bit sequences or symbols that consist of all zeros or all ones. Other modulation requirements may be used in addition or otherwise, such as prohibiting symbols that consist of patterns of 0101 . . . or 1010 . . . , and so forth.

Section A

Referring now to FIG. 1, an encoding system 10 operates in accordance with a high-rate data code that uses m-bit symbols. A bit-by-bit XOR operation is defined to combine two m-bit symbols. An encoder 12 encodes L data symbols d_(j) to produce L+1 symbols that meet the modulation requirements. The encoder 12 consists of one or more XOR gates (not shown) that combine the data symbols with L+1 selected symbols.

Before combining the L data symbols with the selected symbols, a data processor 14 appends a predetermined symbol d_(L) to the data symbols, to produce a sequence of L+1 symbols d_(j), for j=0, 1, . . . , L. The encoder 12 then combines the L+1 symbols with L+1 symbols selected by a selection processor 16, to produce an L+1 m-bit symbol sequence in which the respective symbols c_(j) meet the modulation requirements. The operations of the selection processor are discussed in more detail below by way of an example.

In the example, m=4 and the prohibited symbols are the all zero symbol [0000] and the all one symbol [1111]. The selection processor 16 determines, for the respective symbols d_(j), the various m-bit symbols that when combined therewith produce the prohibited values. For a given symbol d_(j), the selection processor thus determines which symbols combine with d_(j) to produce [0000] and [1111]. Thereafter, the selection processor selects from the remaining 4-bit symbols a particular symbol c_(L) that can be combined with each of the symbols d_(j) such that the respective symbols

$c_{j} = c_{L} + d_{j}$ for j=0, 1, . . . , L

meet the modulation requirements. As discussed in more detail below, for ease of decoding, the appended symbol d_(L) is preferably selected as the all-zero symbol. The combination operation can also be thought of as a matrix operation

$(c_{L}, c_{L-1}, \ldots, c_{1}, c_{0}) = (0, d_{L-1}, d_{L-2}, \ldots, d_{1}, d_{0}) \oplus (c_{L}, c_{L}, \ldots, c_{L})$,

where d_(L) is shown as the all zero symbol.

More specifically, the selection processor 16 determines first that c_(L) cannot assume the values [0000] and [1111], since the appended symbol d_(L) is the all-zero symbol and c_(L)+0=c_(L). In this example, the all zero symbol is included in the L^(th) position in the sequence of data symbols; however, the all-zero symbol may instead be included in any predetermined location in the sequence. The processor also determines that c_(L) cannot assume any value which produces a combination d_(j)+c_(L) equal to either [0000] or [1111], for each of the other values of d_(j). The system then selects c_(L) as one of the remaining 4-bit values, and the encoder 12 combines each of the symbols d_(j) with the selected symbol c_(L). The inclusion of c_(L) as a symbol in a predetermined location within the L+1 symbol sequence c_(j) facilitates decoding, as discussed below.

There are a maximum of bL values that cannot be used for c_(L) because of the L data symbols, where b is the number of prohibited symbols, or in the example, b=2. The appended all-zero symbol excludes the b prohibited values themselves. An m-bit symbol can assume 2^(m) values. Accordingly, for a rate L/(L+1) code to exist for all possible m-bit symbols the following must be satisfied:

$bL < 2^{m} - b$

Dividing by b gives

$L < \frac{2^{m}}{b} - 1.$

The value of L must be strictly less than (2^(m)/b)−1 and thus if b is a power of 2, i.e., b=2^(s), the maximum value for L is

$L_{\max} = 2^{m-s} - 2.$

If b is not a power of 2,

$L_{\max} = \mathrm{floor}\left( \frac{2^{m}}{b} \right) - 1,$

where the function floor(y) is defined as the largest integer less than or equal to y. For any value of b, the maximum value of L is thus

$L_{\max} = \mathrm{ceiling}\left( \frac{2^{m}}{b} \right) - 2,$

where the function ceiling(y) is defined as the smallest integer greater than or equal to y. The best code rate R is then:

$R = \frac{\mathrm{ceiling}\left( \frac{2^{m}}{b} \right) - 2}{\mathrm{ceiling}\left( \frac{2^{m}}{b} \right) - 1}$

In the example, m=4 and there are two prohibited symbols. Therefore,

$L_{\max} = 2^{4-1} - 2 = 6$

and the best code rate is $\frac{6}{7}$. The system thus combines respective sequences of six data symbols with selected symbols c_(L) to produce corresponding 7-symbol sequences that meet the modulation requirements.
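The bound above can be checked numerically. The following is a minimal sketch, not part of the described system; the function names `max_data_symbols` and `best_rate` are hypothetical and chosen only for illustration.

```python
from fractions import Fraction


def max_data_symbols(m: int, b: int) -> int:
    """Largest L satisfying b*L < 2**m - b, i.e. ceiling(2**m / b) - 2."""
    ceiling = -(-(2 ** m) // b)  # integer ceiling of 2**m / b
    return ceiling - 2


def best_rate(m: int, b: int) -> Fraction:
    """Code rate L/(L+1) at the maximum L."""
    L = max_data_symbols(m, b)
    return Fraction(L, L + 1)


# The example in the text: 4-bit symbols, two prohibited symbols.
print(max_data_symbols(4, 2), best_rate(4, 2))  # 6 6/7
```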

In the example the six data symbols d₅, d₄ . . . d₀ are:

[0000],[0001],[0010],[0011],[0100],[0101]

and an all-zero symbol is appended as d₆ to produce

[0000],[0000],[0001],[0010],[0011],[0100],[0101].

For each of the symbols d_(j), the selected c_(L) must satisfy

c_(L)+d_(j)≠[0000] and c_(L)+d_(j)≠[1111]

or

for d₆=[0000], c_(L) cannot be [0000] or [1111]
for d₅=[0000], c_(L) cannot be [0000] or [1111]
for d₄=[0001], c_(L) cannot be [0001] or [1110]
for d₃=[0010], c_(L) cannot be [0010] or [1101]
for d₂=[0011], c_(L) cannot be [0011] or [1100]
for d₁=[0100], c_(L) cannot be [0100] or [1011]
for d₀=[0101], c_(L) cannot be [0101] or [1010]

There are thus 12 values that c_(L) cannot assume, and the 16−12=4 values c_(L) can assume are:

[0111],[1000],[0110],[1001]

The selection processor 16 selects one of these 4 remaining values as c_(L). In the example, the processor selects [1000], and the L+1 encoded symbols are c_(L)+d_(j) for j=0, 1, . . . , L, or:

[1000],[1000],[1001],[1010],[1011],[1100],[1101].

To recover the data, the selected symbol c_(L) is removed from the corresponding symbols c_(j). Thus, d_(j)=c_(j)+c_(L) for j=0, 1, . . . , 5, where c_(L)=c₆. The corresponding decoder consists of one or more XOR gates (not shown).
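The Section A selection, encoding and decoding steps can be sketched in a few lines. This is an illustrative sketch only: the function names are hypothetical, symbols are held as integers, and the appended all-zero symbol is assumed to occupy the leading position as in the example.

```python
PROHIBITED = {0b0000, 0b1111}


def allowed_selectors(data, m=4, prohibited=PROHIBITED):
    """All m-bit values c_L for which every c_L XOR d_j is permitted."""
    symbols = [0] + list(data)  # all-zero symbol d_L appended in front
    return [c for c in range(2 ** m)
            if all((c ^ d) not in prohibited for d in symbols)]


def encode(data, selector):
    """XOR the selected symbol into the appended zero and every data symbol."""
    return [selector ^ d for d in [0] + list(data)]


def decode(code):
    """The leading symbol is c_L itself; XOR it back out of the rest."""
    c_L = code[0]
    return [c ^ c_L for c in code[1:]]


data = [0b0000, 0b0001, 0b0010, 0b0011, 0b0100, 0b0101]  # d5 .. d0
candidates = allowed_selectors(data)
code = encode(data, 0b1000)
print([format(c, "04b") for c in candidates])
print([format(c, "04b") for c in code])
```

Selecting the first allowed value instead of [1000] would work equally well; the text leaves the choice among the remaining values open.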

Section B

A second approach to the high code rate encoding may be used in conjunction with data codes over GF(2^(m)). The operations of Galois Field addition and Galois Field multiplication are defined over the field as, respectively, bit-by-bit XOR'ing of two Galois Field elements and polynomial multiplication of two elements modulo the associated primitive polynomial, where the elements are treated as degree m-1 polynomials.

For m=4, the Galois Field elements are treated as degree three polynomials and multiplication is modulo the primitive polynomial x⁴+x+1. The Galois Field elements of GF(2^(m)) may be written as powers of a primitive element α and/or as 4-bit symbols. For example, the symbol [0001] is α⁰, the symbol [0010] is α¹ and so forth. The symbols of GF(2⁴) are thus:

0=[0000], α⁰=[0001], α¹=[0010], α²=[0100], α³=[1000], α⁴=[0011], α⁵=[0110], α⁶=[1100], α⁷=[1011], α⁸=[0101], α⁹=[1010], α¹⁰=[0111], α¹¹=[1110], α¹²=[1111], α¹³=[1101], α¹⁴=[1001]
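The table of powers of α can be regenerated by repeated multiplication by x modulo x⁴+x+1. The sketch below assumes the polynomial is encoded as the bit mask 0b10011; the function name `gf_powers` is hypothetical.

```python
def gf_powers(poly=0b10011, m=4):
    """Return [alpha^0, alpha^1, ..., alpha^(2^m - 2)] as m-bit integers."""
    table = []
    x = 1
    for _ in range(2 ** m - 1):
        table.append(x)
        x <<= 1                 # multiply by alpha (that is, by x)
        if x & (1 << m):        # degree m term appeared: reduce modulo poly
            x ^= poly
    return table


for power, value in enumerate(gf_powers()):
    print(f"alpha^{power} = [{value:04b}]")
```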

With both addition and multiplication defined over the Galois Field, the high rate code encoding uses a fixed sequence of non-zero symbols f_(j) and a selected Galois Field element c_(L) such that the L+1 symbol encoded sequence is

$c_{j} = c_{L} * f_{j} + d_{j}$ for j=0, 1, . . . , L,

with an all zero symbol appended as d_(L). The combining of the symbols d_(j) with the corresponding f_(j) and c_(L) may also be described using matrix operations as:

$(c_{L}, c_{L-1}, \ldots, c_{1}, c_{0}) = (0, d_{L-1}, \ldots, d_{1}, d_{0}) \oplus \left[ c_{L} \otimes (f_{L}, f_{L-1}, \ldots, f_{1}, f_{0}) \right]$

The symbol f_(L) may but need not be α⁰, such that the L^(th) symbol of the encoded sequence is c_(L). Otherwise, the value of the selected symbol may be calculated by removing f_(L) from the symbol in the L^(th) position of the sequence. The remaining symbols f_(L-1), f_(L-2), . . . f₀ may be arbitrarily selected non-zero elements of GF(2^(m)).

The encoding process selects a value for c_(L) such that the respective symbols c_(j) satisfy the modulation requirements. The operations of the system are again explained by way of an example.

Let the symbols d_(j) be the sequence

[0000], [0111], [0001], [0010], [0011], [0100], [0101]

or

0, α¹⁰, α⁰, α¹, α⁴, α², α⁸

which includes the all-zero symbol as d_(L), and the fixed sequence of symbols f_(j) be

[0001], [1100], [0001], [0010], [0010], [0100], [0100]

or

α⁰, α⁶, α⁰, α¹, α¹, α², α².

The system selects c_(L) by first determining which values of c_(L) are prohibited.

For d₆=[0000] and f₆=[0001]=α⁰: (c_(L) ⊗ α⁰) ⊕ [0000]≠[0000] or c_(L) ⊗ α⁰≠[0000] and (c_(L) ⊗ α⁰) ⊕ [0000]≠[1111] or c_(L) ⊗ α⁰≠[1111]. Accordingly, c_(L) cannot be either [0000] or [1111].

For d₅=[0111]=α¹⁰ and f₅=[1100]=α⁶: (c_(L) ⊗ α⁶) ⊕ [0111]≠[0000] or c_(L) ⊗ α⁶≠[0111] and (c_(L) ⊗ α⁶) ⊕ [0111]≠[1111] or c_(L) ⊗ α⁶≠[1000]. Accordingly, c_(L) cannot be either [0011] or [1111].

For d₄=[0001]=α⁰ and f₄=[0001]=α⁰: (c_(L) ⊗ α⁰) ⊕ [0001]≠[0000] or c_(L) ⊗ α⁰≠[0001] and (c_(L) ⊗ α⁰) ⊕ [0001]≠[1111] or c_(L) ⊗ α⁰≠[1110]. Accordingly, c_(L) cannot be either [0001] or [1110].

For d₃=[0010]=α¹ and f₃=[0010]=α¹: (c_(L) ⊗ α¹) ⊕ [0010]≠[0000] or c_(L) ⊗ α¹≠[0010] and (c_(L) ⊗ α¹) ⊕ [0010]≠[1111] or c_(L) ⊗ α¹≠[1101]. Accordingly, c_(L) cannot be either [0001] or [1111].

For d₂=[0011]=α⁴ and f₂=[0010]=α¹: (c_(L) ⊗ α¹) ⊕ [0011]≠[0000] or c_(L) ⊗ α¹≠[0011] and (c_(L) ⊗ α¹) ⊕ [0011]≠[1111] or c_(L) ⊗ α¹≠[1100]. Accordingly, c_(L) cannot be either [1000] or [0110].

For d₁=[0100]=α² and f₁=[0100]=α²: (c_(L) ⊗ α²) ⊕ [0100]≠[0000] or c_(L) ⊗ α²≠[0100] and (c_(L) ⊗ α²) ⊕ [0100]≠[1111] or c_(L) ⊗ α²≠[1011]. Accordingly, c_(L) cannot be either [0001] or [0110].

For d₀=[0101]=α⁸ and f₀=[0100]=α²: (c_(L) ⊗ α²) ⊕ [0101]≠[0000] or c_(L) ⊗ α²≠[0101] and (c_(L) ⊗ α²) ⊕ [0101]≠[1111] or c_(L) ⊗ α²≠[1010]. Accordingly, c_(L) cannot be either [1100] or [1011].

Several of the prohibited values coincide, so that nine distinct values are prohibited. The system then selects c_(L) from the 16−9=7 remaining symbols

α¹=[0010], α²=[0100], α⁸=[0101], α⁹=[1010], α¹⁰=[0111], α¹³=[1101] and α¹⁴=[1001]

In the example, the selection processor selects c_(L)=α⁸ and

(c₆, c₅, c₄, c₃, c₂, c₁, c₀)=(0, d₅, d₄, d₃, d₂, d₁, d₀) ⊕ [α⁸ ⊗ (f₆, f₅, f₄, f₃, f₂, f₁, f₀)]=[0101], [1110], [0100], [1000], [1001], [0011], [0010]

or

α⁸, α¹¹, α², α³, α¹⁴, α⁴, α¹

If every f_(j) is instead equal to α⁰, the encoding c_(j)=c_(L)*f_(j)+d_(j) produces the same result as the encoding of Section A, since c_(L)*f_(j)=c_(L) for every j.

During decoding, the system recovers the data symbols as

$d_{j} = c_{j} + (c_{L} * f_{j})$ for j=0, 1, . . . , 5, with c₆=c_(L).
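The Section B example can be reproduced with a small GF(2⁴) multiplier. This is a hedged sketch rather than the described hardware: the helper names (`gf_mul`, `candidates`, `encode`, `decode`) are hypothetical, and the decoder assumes f_(L)=α⁰ so that the leading encoded symbol equals c_(L).

```python
PROHIBITED = {0b0000, 0b1111}


def gf_mul(a, b, poly=0b10011, m=4):
    """Multiply two GF(2^m) elements modulo the primitive polynomial."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a & (1 << m):
            a ^= poly
    return result


def candidates(d, f):
    """Values c_L for which every c_L*f_j + d_j is a permitted symbol."""
    return [c for c in range(16)
            if all((gf_mul(c, fj) ^ dj) not in PROHIBITED
                   for dj, fj in zip(d, f))]


def encode(d, f, c_L):
    return [gf_mul(c_L, fj) ^ dj for dj, fj in zip(d, f)]


def decode(code, f):
    c_L = code[0]  # valid because f[0] is alpha^0 (assumed)
    return [cj ^ gf_mul(c_L, fj) for cj, fj in zip(code[1:], f[1:])]


d = [0b0000, 0b0111, 0b0001, 0b0010, 0b0011, 0b0100, 0b0101]  # d6 .. d0
f = [0b0001, 0b1100, 0b0001, 0b0010, 0b0010, 0b0100, 0b0100]  # f6 .. f0
code = encode(d, f, 0b0101)  # c_L = alpha^8
print([format(c, "04b") for c in candidates(d, f)])
print([format(c, "04b") for c in code])
```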

Section C

If the code is a shortened (n,k) Reed Solomon error correction code (ECC) over GF(2^(m)), the data symbols are encoded into a code word D of the ECC code and the fixed pattern F is also a code word of the Reed Solomon code. The “information symbols” f_(n-1) to f_(r) are arbitrarily selected non-zero elements of GF(2^(m)), with f_(n) selected as α⁰ in the example. The symbols f_(r-1) to f₀ are the associated n−k=r redundancy symbols. For convenience, all of the information symbols of F may be selected to be α⁰, such that the fixed code word is:

$F = (\alpha^{0}, \alpha^{0}, \ldots, \alpha^{0}, h_{r-1}, \ldots, h_{1}, h_{0})$

If, however, the code word with all α⁰ information symbols includes one or more all zero redundancy symbols, the code word F may instead be

$(\alpha^{0}, \alpha^{0}, \ldots, \alpha^{0}, \alpha^{0}, h_{r-1}, h_{r-2}, h_{r-3}, \ldots, h_{1}, h_{0}) + \left[ \alpha^{p} * (0, 0, \ldots, 0, g_{r}, g_{r-1}, g_{r-2}, \ldots, g_{2}, g_{1}, g_{0}) \right]$

where α^(p) is a selected element of GF(2^(m)) and the g_(j)'s are the coefficients of the ECC generator polynomial:

$g(x) = g_{r} x^{r} + g_{r-1} x^{r-1} + \ldots + g_{1} x + g_{0}.$

The fixed-symbol code word F is thus α^(p)*(α⁰, α⁰, . . . , α⁰, h′_(r), h′_(r-1), . . . , h′₀) and an α^(p) exists as long as r<n≤L_(max).

Assuming n≤L_(max), the system produces an L+1 symbol transmission code word C by selecting the value α^(p) as discussed above in Section B and

$C = (c_{n}, c_{n-1}, \ldots, c_{1}, c_{0}) = (0, d_{n-1}, \ldots, d_{1}, d_{0}) + \left[ \alpha^{p} * (\alpha^{0}, \alpha^{0}, \ldots, \alpha^{0}, h'_{r}, h'_{r-1}, \ldots, h'_{1}, h'_{0}) \right]$

where d_(j) are the data and redundancy symbols of a code word D.

The decoding system first decodes the transmission code word C in accordance with the ECC, to correct any errors. The system then recovers d_(j) by determining:

For data symbols d_(n−1), d_(n−2), . . . , d_(r+1), that is, for j=n−1, n−2, . . . , r+1: d_(j)=c_(j)+α^(p)=c_(j)+c_(n)

For the data symbol j=r: d_(r)=c_(r)+(α^(p)*h′_(r))=c_(r)+(c_(n)*h′_(r)), and

For the redundancy symbols d_(r−1), d_(r−2), . . . , d₀, that is, for j=r−1, r−2, . . . , 0: d_(j)=c_(j)+(α^(p)*h′_(j))=c_(j)+(c_(n)*h′_(j)).
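The single fixed code word case can be illustrated with a toy Reed Solomon code. Everything in this sketch is an assumption made for illustration: a code over GF(2⁴) with generator g(x)=(x+α⁰)(x+α¹), k=4 data symbols and r=2 redundancy symbols, a fixed code word F whose information symbols are all α⁰, and hypothetical function names. Error correction itself is omitted; the decoder assumes an error-free word.

```python
# GF(2^4) log/antilog tables for the primitive polynomial x^4 + x + 1.
EXP = []
_x = 1
for _ in range(15):
    EXP.append(_x)
    _x <<= 1
    if _x & 0b10000:
        _x ^= 0b10011
LOG = {v: i for i, v in enumerate(EXP)}

PROHIBITED = {0b0000, 0b1111}
GEN = [1, 0b0011, 0b0010]  # x^2 + alpha^4 x + alpha^1, highest order first


def mul(a, b):
    return 0 if a == 0 or b == 0 else EXP[(LOG[a] + LOG[b]) % 15]


def poly_eval(word, x):
    """Evaluate a word (highest-order symbol first) at x by Horner's rule."""
    acc = 0
    for s in word:
        acc = mul(acc, x) ^ s
    return acc


def rs_encode(info):
    """Systematic encoding: append the remainder of info(x)*x^r mod g(x)."""
    rem = list(info) + [0] * (len(GEN) - 1)
    for i in range(len(info)):
        coef = rem[i]
        if coef:
            for j, g in enumerate(GEN):
                rem[i + j] ^= mul(coef, g)
    return list(info) + rem[len(info):]


def encode(data):
    D = rs_encode(data)                    # data code word, n symbols
    F = rs_encode([1] * (len(data) + 1))   # fixed code word, n + 1 symbols
    padded = [0] + D                       # all-zero symbol in position n
    for a in range(1, 16):                 # try each non-zero multiplier
        C = [d ^ mul(a, f) for d, f in zip(padded, F)]
        if not PROHIBITED & set(C):
            return C
    raise ValueError("no multiplier yields an allowed code word")


def decode(C):
    F = rs_encode([1] * (len(C) - len(GEN) + 1))
    a = C[0]                               # c_n equals the multiplier
    D = [c ^ mul(a, f) for c, f in zip(C[1:], F[1:])]
    return D[:len(D) - (len(GEN) - 1)]     # drop the redundancy symbols


data = [0b0000, 0b1111, 0b0110, 0b1001]
C = encode(data)
print([format(c, "04b") for c in C], decode(C) == data)
```

Because the appended zero symbol and F are both part of code words of the same linear code, C still evaluates to zero at the roots of g(x), which is what lets the ECC decoder run before F is removed.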

Assuming now that n>L_(max), two or more fixed symbol code words F₀, F₁ . . . F_(v) are required such that F=F₀+F₁+ . . . +F_(v). The code words are selected such that non-zero information symbols of the respective code words F_(i) correspond to particular segments of the data code word that include L_(max) or fewer symbols.

In the example, 3*L_(max)>n>2*L_(max), and three fixed-symbol code words are required. The result is an n+3 symbol transmission code word C that includes the values α^(p0), α^(p1), and α^(p2) in predetermined symbol locations. In the example, the fixed-symbol code words F₂, F₁ and F₀ are selected such that the code word F=F₀+F₁+F₂ has α⁰ in the locations that correspond to c_(n+2), c_(n+1) and c_(n), to simplify the decoding operations.

The code word manipulation processor appends three all zero symbols to the data code word to produce:

(0, 0, 0, d_(n−1), d_(n−2), . . . d₁, d₀)

and the code words F₂, F₁ and F₀ are then:

Note that the respective fixed-symbol code words F_(i) have predetermined ones of the symbols f_(n+2), f_(n+1) and f_(n) set to α⁰ and the remaining set to all zeros. Further, the fixed-symbol code words F₂ and F₁ each have L_(max) predetermined information symbols set to α⁰ with the remaining information symbols set to all zeros. In particular, F₂ includes information symbols f_(n−1) to f_(n−1−Lmax) set to α⁰ and F₁ includes information symbols f_(n−1−Lmax−2) to f_(L) set to α⁰. The last r symbols of the code words F₂ and F₁ are the r redundancy symbols that correspond to the respective code word information symbols. For these code words, the redundancy symbols may include one or more all zero symbols. The code word F₀ has all zeros for the information symbols that correspond to the symbols that are set to α⁰ in F₂ and F₁. Further, F₀ has the next L-r-1 information symbols set to α⁰ and, as discussed above, the remaining symbols f_(r) to f₀ are the associated redundancy symbols h′_(r) to h′₀. While the redundancy symbols of the code words F₁ and F₂ may include one or more all-zero symbols, the combinations of the corresponding redundancy symbols of the code words F_(i) may not include all-zero symbols.

Referring now to FIG. 2, an ECC encoder 20 encodes the data symbols to produce the data code word D. The selection processor 16 first determines a value of α^(p2) that ensures that the L_(max) information symbols of C that correspond to the α⁰ information symbols of F₂ do not have the prohibited values. In the example, the prohibited values are either all zeros or all ones. Thus, the value of α^(p2) is selected such that the symbols c_(n−1) to c_(n−1−Lmax) do not have the values [0000] or [1111]. Similarly, the selection processor determines a value of α^(p1) such that the next L_(max) symbols of C, that is, the symbols that correspond to the α⁰ information symbols of F₁, do not have the values [0000] or [1111]. Finally, the value of α^(p0) is selected such that the remaining symbols of C, which include the remaining L-r information symbols and the r redundancy symbols, do not have values of [0000] or [1111]. The code word F is then α^(p0)F₀+α^(p1)F₁+α^(p2)F₂.

In matrix notation, the system determines α^(p1) and α^(p2) such that

$(t_{n+2}, t_{n+1}, t_{n}, \ldots, t_{2}, t_{1}, t_{0}) = (0, 0, 0, d_{n-1}, d_{n-2}, d_{n-3}, \ldots, d_{2}, d_{1}, d_{0})$

$\oplus \left[ a_{n+2} \otimes (\alpha^{0}, 0, 0, \alpha^{0}, \alpha^{0}, \ldots, \alpha^{0}, 0, 0, \ldots, 0, h_{r-1}, h_{r-2}, h_{r-3}, \ldots, h_{2}, h_{1}, h_{0}) \right]$

$\oplus \left[ a_{n+1} \otimes (0, \alpha^{0}, 0, 0, 0, \ldots, 0, 0, \alpha^{0}, \alpha^{0}, \alpha^{0}, \ldots, \alpha^{0}, 0, 0, \ldots, 0, h'_{r-1}, h'_{r-2}, h'_{r-3}, \ldots, h'_{2}, h'_{1}, h'_{0}) \right]$

and t_(j)≠[0000] or [1111] for j=n−1, n−2, . . . , n−2*L_(max).

The system then determines α^(p0) such that

$(c_{n+2}, c_{n+1}, c_{n}, c_{n-1}, \ldots, c_{2}, c_{1}, c_{0}) = (t_{n+2}, t_{n+1}, t_{n}, \ldots, t_{2}, t_{1}, t_{0})$

$\oplus \left[ a_{n} \otimes (0, 0, \alpha^{0}, 0, 0, 0, \ldots, 0, 0, \alpha^{0}, \alpha^{0}, \alpha^{0}, \ldots, \alpha^{0}, h''_{r}, h''_{r-1}, h''_{r-2}, h''_{r-3}, \ldots, h''_{2}, h''_{1}, h''_{0}) \right]$

and c_(j)≠[0000] or [1111] for j=0, 1, . . . , n+2.

The data code word processor then produces the transmission code word by combining the code word F with the data code word D and appended all-zero symbols.

For decoding, the decoder manipulates the transmission code word C in accordance with the Reed Solomon ECC, to correct errors in the code word. The data symbols d_(n−1) to d_(r) are then recovered from the error free information symbols of C by removing the code word F.

Section D

This section describes the operations involved in selecting fixed-symbol code words F_(i) for linear codes in general, that is, for binary linear codes, such as, for example, low density parity check codes. The fixed symbols are then used with the data and associated redundancy information in the manner discussed above.

The fixed-symbol code words are selected based on the underlying linear code. A linear code is described by its parity check matrix, or equivalently by its generator matrix, with the generator matrices of known linear codes typically listed in encoding textbooks. Every code word is a linear combination of the rows of the generator matrix. Accordingly, for information “u,” the corresponding code word is

$v = u \otimes G.$

For an (n,k) systematic code, that is, a code in which the data are unaltered during encoding, the generator matrix includes a k×k identity matrix as a sub-matrix. For a (28,20) code, for example, the generator matrix includes a 20×20 identity matrix as a sub-matrix. Not all linear codes have “symbols”; some instead have information bit sequences. Thus, a code word of the (n,k) linear code may have k information bits and n-k redundancy bits. While the following discussion assumes a binary code, the techniques can be readily generalized for non-binary linear codes.

The system treats bit sequences of a selected length as “symbols.” The selected length is m and preferably m is selected such that k is divisible by m. In the example, k=20 bits and the selected symbol size is m=4. As illustrated in FIG. 3, the generator matrix G can be sub-divided into m×m, or in the example, 4×4, sub-matrices. A sub-divided matrix G′ is then formed by adding together the corresponding rows of the 4×4 sub-matrices. Thus, the first rows of the sub-matrices of G′ are the sums of the first rows of the corresponding m×m sub-matrices of G.

The matrix G′ has $\frac{n}{m}$, or $\frac{28}{4}=7$, 4×4 sub-matrices, of which $\frac{k}{m}$, or $\frac{20}{4}=5$, are identity sub-matrices that correspond to the code word m-bit information “symbols.” Assuming that there are a sufficient number of independent rows in G′, the m rows of the matrix G′ are the fixed-symbol code words F₀, F₁, . . . , F_(m−1), with the corresponding row of each sub-matrix providing the respective m-bit symbols for each code word. The fixed-symbol code word F that is combined with the data code word is a combination of the fixed-symbol code words F₀, F₁ . . . F_(m−1) and a selected vector a, with the elements a_(i) of the vector a selected in the manner discussed above. Thus, the symbols of the transmission code word C are

$c_{j} = d_{j} + [a_{0}, a_{1}, \ldots, a_{m-1}] \otimes G'_{j}$, for j=0, 1, . . . , k/m−1

where G′_(j) is the corresponding m×m sub-matrix of G′. The number of fixed-symbol code words required is based on the code length n, as discussed above, and thus, certain a_(i) may be zero if all m of the fixed-symbol code words are not required.

To ensure that the rows of the matrix G′ provide an appropriate number of independent fixed-symbol code words F_(i), the system calculates the ranks of the respective sub-matrices G′_(j). The lower the rank, the fewer choices for the various respective symbols of the code words F₀, F₁, and so forth. As shown in FIG. 3, the identity sub-matrices each have full rank of R=4. The remaining sub-matrices have ranks of 2 and 3, respectively.

For every m-tuple or m-bit “symbol,” there are 2^(m) possible values and in the example there are b=2 prohibited symbols, namely, [0000] and [1111]. Thus, for a given d_(j) there are at least two excluded values for f_(j), where f_(j) is the corresponding symbol of F. If the rank of the corresponding sub-matrix G_(j)′ is less than full, there may also be other excluded values since the possible values for a given code word symbol f_(j) are combinations of the rows of the corresponding sub-matrix.

For a given sub-matrix G_(j)′ of rank R_(j), there are b*2^(m−R_(j)) excluded values. To determine if the sub-matrices G_(j)′ are capable of producing the necessary code words the system produces the sum:

$S = \sum\limits_{j=0}^{\frac{n}{m} - 1} 2^{m - R_{j}}$

and the maximum number of prohibited symbols b_(max)=(2^(m)−1)/S, rounded down to an integer. In the example depicted in FIG. 3, S=11, m=4, and b_(max)=1. Thus, the ranks of the sub-matrices of G′ depicted in FIG. 3 are not large enough to support the b=2 prohibited symbols of the example.

To produce sub-matrices of higher rank, the system permutes various rows of G to produce a matrix G″ that has associated sub-matrices G_(j)″ with higher ranks. As discussed, the corresponding rows of the m×m sub-matrices of G are combined to produce the respective rows of the m×m sub-matrices G′_(j). Accordingly, one approach to permuting the rows of G is to determine associated m×m permutation sub-matrices that operate on the various sub-matrices of G. Thus, the system determines an m×m permutation sub-matrix P₀ that operates on the sub-matrices that include rows 0 to m−1 of G, a permutation sub-matrix P₁ for the sub-matrices that include rows m to 2m−1 of G, and so forth. Preferably, many of the permutation sub-matrices P_(i) are identity matrices, and thus, the corresponding rows of G are not changed. In the example, the permutation sub-matrices P₁, P₂, P₃ and P₄ are identity matrices such that rows 0 to 15 are unchanged, and

$P_{5} = \left\lbrack \begin{matrix}0100 \\0001 \\0010 \\1000\end{matrix} \right\rbrack$

such that rows 16 to 19 of G are permuted and

$G^{''} = \left\lbrack \begin{matrix}1000 & 1000 & 1000 & 1000 & 0100 & 0010 & 1011 \\0100 & 0100 & 0100 & 0100 & 0001 & 0100 & 1100 \\0010 & 0010 & 0010 & 0010 & 0010 & 1010 & 0110 \\0001 & 0001 & 0001 & 0001 & 1000 & 1101 & 0010\end{matrix} \right\rbrack$

Each sub-matrix G_(j)″ then has rank of R_(j)=4, and the code words F_(i), which are the rows of G″, are an optimal set of code words. Generally, for larger codes and as m increases the ranks of the respective sub-matrices G_(j)′ are equal or at least close to m. Accordingly, the permutation of the rows of G is not typically required for the larger codes.
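The rank test can be sketched as follows. The sketch computes GF(2) ranks of the 4×4 sub-matrices of G″ as printed above, along with the sum S and b_(max). The function names are hypothetical, and rounding b_(max) down to an integer is an assumption chosen to agree with the stated result S=11, b_(max)=1.

```python
def gf2_rank(rows):
    """Rank over GF(2) of a matrix whose rows are given as integers."""
    basis = []
    for v in rows:
        for b in basis:
            v = min(v, v ^ b)   # clear the leading bit of b if v has it
        if v:
            basis.append(v)
    return len(basis)


def max_prohibited(ranks, m):
    """Return (S, b_max) for sub-matrix ranks R_j and symbol size m."""
    S = sum(2 ** (m - r) for r in ranks)
    return S, (2 ** m - 1) // S


G2_ROWS = [
    "1000 1000 1000 1000 0100 0010 1011",
    "0100 0100 0100 0100 0001 0100 1100",
    "0010 0010 0010 0010 0010 1010 0110",
    "0001 0001 0001 0001 1000 1101 0010",
]
# Regroup the printed rows into seven sub-matrices of four rows each.
sub_matrices = list(zip(*(row.split() for row in G2_ROWS)))
ranks = [gf2_rank([int(r, 2) for r in sub]) for sub in sub_matrices]

print(ranks, max_prohibited(ranks, 4))
print(max_prohibited([4, 4, 4, 4, 4, 2, 3], 4))  # ranks stated for FIG. 3
```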

After determining the code words F_(i), the system then selects the m-tuples a_(i) for the elements of the vector a in the manner discussed above. The number of m-tuples that may be tried as values of a_(i) depends on the ranks of the respective sub-matrices G′_(j) or G″_(j). With the rank of each sub-matrix essentially equal to m or m−1, the number of m-tuples is between n/m and 2*(n/m). The maximum number of m-tuples to try is then b*2*(n/m), where b is the number of prohibited symbols. Accordingly, a small subset of the possible 2^(m) values is tried.

One approach is to select a non-zero m-tuple as essentially a seed value and use an arbitrary shift register or other random number generator to produce the desired number of m-tuples. Thus, the system may use a linear feedback shift register to shift some number of times to produce a next m-tuple, and so forth. The shift register may, for example, shift x*S+1 times to produce the respective trial values. The system then tests the appropriate number of values a_(i) and substitutes new trial values as necessary.
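The trial-value search can be sketched with a small shift register. The assumptions here are loud ones: a 4-bit Galois-style register with feedback polynomial x⁴+x+1, a single vector a applied to every sub-matrix of G″, the most significant bit of a taken as a₀, and an arbitrary 7-symbol input word that is not claimed to be a code word of the (28,20) code. All names are hypothetical.

```python
PROHIBITED = {0b0000, 0b1111}

ROWS = [
    "1000 1000 1000 1000 0100 0010 1011",
    "0100 0100 0100 0100 0001 0100 1100",
    "0010 0010 0010 0010 0010 1010 0110",
    "0001 0001 0001 0001 1000 1101 0010",
]
# BLOCKS[j] holds the four rows of sub-matrix G''_j as integers.
BLOCKS = [[int(r, 2) for r in col] for col in zip(*(row.split() for row in ROWS))]


def lfsr_trials(seed, count, shifts=1, poly=0b10011, m=4):
    """Yield `count` non-zero m-bit trial values, shifting `shifts` times each."""
    state = seed
    for _ in range(count):
        yield state
        for _ in range(shifts):
            state <<= 1
            if state & (1 << m):
                state ^= poly


def combine(a, block):
    """a (x) G''_j over GF(2): XOR the rows selected by the bits of a."""
    out = 0
    for i, row in enumerate(block):
        if a & (0b1000 >> i):   # bit 3 of a is a_0 (assumed ordering)
            out ^= row
    return out


def select_vector(word, trials):
    """Return the first trial vector and coded word with no prohibited symbol."""
    for a in trials:
        coded = [d ^ combine(a, blk) for d, blk in zip(word, BLOCKS)]
        if not PROHIBITED & set(coded):
            return a, coded
    return None


word = [0b0000, 0b1111, 0b0001, 0b0010, 0b0011, 0b0101, 0b1010]
print(select_vector(word, lfsr_trials(0b0001, 15)))
```

With every sub-matrix of full rank, each symbol position rules out exactly two values of a, so a 7-symbol word leaves at least two of the sixteen values usable.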

For a non-systematic (n,k) code, that is, a code that alters the data during encoding, the k×n generator matrix G has rank k and there are non-singular row and column permutation matrices such that

$PGQ = G^{*}$

where P is a row permutation matrix, Q is a column permutation matrix and G* includes a k×k identity sub-matrix. The permuting of the rows changes the systematic code within the same code space, while the permuting of the columns transforms the code into a non-systematic code. The matrix G* is then sub-divided into m×m sub-matrices, and the fixed code words F₀, F₁, . . . F_(m−1) are obtained after re-transforming the code, that is, by combining the corresponding rows of the m×m sub-matrices of (G*)(Q⁻¹), to produce sub-matrices G_(j)′.

If the permutation of the rows of G or (G*)(Q⁻¹) does not produce full rank sub-matrices G_(j)′ in the manner discussed above, the system may determine a full rank sub-matrix s for a given m×m matrix A by finding all possible solutions for the matrix equation xA=s and selecting the solution with the highest rank. First, using row operations only, determine PA=Z, where Z has k non-zero rows and m-k all-zero rows. Using column operations only, i.e., a column permutation matrix Q, reduce Z to the canonical form:

$PAQ = ZQ = \left\lbrack \begin{array}{c|c} I_{k \times k} & B \\ \hline 0_{(m - k) \times k} & 0_{(m - k) \times (m - k)} \end{array} \right\rbrack$

where I_(k×k) is a k×k identity matrix and 0_(r×s) is an r×s matrix with all zero elements. To solve xA=s, first try to solve yPAQ=c:

$y\left\lbrack \begin{array}{c|c} I_{k \times k} & B \\ \hline 0_{(m - k) \times k} & 0_{(m - k) \times (m - k)} \end{array} \right\rbrack = \left\lbrack y_{1}, y_{2} \right\rbrack * \left\lbrack \begin{array}{c|c} I_{k \times k} & B \\ \hline 0_{(m - k) \times k} & 0_{(m - k) \times (m - k)} \end{array} \right\rbrack = \left\lbrack y_{1}, y_{1} * B \right\rbrack = \left\lbrack c_{1}, c_{2} \right\rbrack$

where y₁ is a 1×k matrix and y₂ is a 1×(m-k) matrix. Check whether c₁*B=c₂ and, if so, one solution is x₀=[c₁, c₂]*P. Further, given c=sQ=[c₁, c₂] with c₁*B=c₂, all solutions are of the form x₁=x₀+[0,z]*P, where z is any 1×(m-k) row matrix, and there are therefore 2^((m-k)) solutions.
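The solution count can be checked on a small case. The sketch below does not perform the canonical-form reduction described above; it simply enumerates every candidate row vector x for a small m and compares the number of solutions of xA=s with 2 raised to (m minus the rank of A). The matrix A and the vectors s are hypothetical, and the helper names are illustrative.

```python
from itertools import product


def vec_mat(x, A):
    """Row vector x times matrix A, with arithmetic over GF(2)."""
    cols = len(A[0])
    return tuple(
        sum(x[i] & A[i][j] for i in range(len(A))) % 2 for j in range(cols)
    )


def solutions(A, s):
    """Enumerate all x over GF(2) with xA = s (feasible only for small m)."""
    target = tuple(s)
    return [x for x in product((0, 1), repeat=len(A)) if vec_mat(x, A) == target]


def rank_gf2(A):
    """Rank of A over GF(2) using leading-bit elimination on integer rows."""
    pivots = {}
    for row in A:
        r = int("".join(str(bit) for bit in row), 2)
        while r:
            top = r.bit_length() - 1
            if top not in pivots:
                pivots[top] = r
                break
            r ^= pivots[top]
    return len(pivots)


# Hypothetical 4x4 matrix of rank 2: row 3 is the sum of rows 1 and 2.
A = [
    [1, 0, 1, 1],
    [0, 1, 0, 1],
    [1, 1, 1, 0],
    [0, 0, 0, 0],
]
m = len(A)
reachable = solutions(A, [1, 1, 1, 0])    # s lies in the row space of A
unreachable = solutions(A, [0, 0, 0, 1])  # s lies outside the row space
print(len(reachable), 2 ** (m - rank_gf2(A)), len(unreachable))
```

When s lies in the row space of A the enumeration finds 2^(m−rank) solutions, and when the consistency check fails it finds none, which matches the two cases distinguished in the text.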

We have depicted the system as including a plurality of processors, such as the data manipulation processor and the selection processor. The processors may be combined into a single processor or arranged as various other groupings of processors. The instructions for the operations that the processors perform may be stored on memory resident on the respective processors, or on memory that is resident on certain of the processors and shared with or made available to other processors. Alternatively, the instructions for one or more of the operations may be made available to or communicated to the processors by, for example, a system controller (not shown).

The system is readily implemented by means of one or more digital processors, either general purpose or special purpose. Conventional data processing software and algorithms are readily applied to perform the requisite processing described herein.

1. A method of encoding data including the steps of: A. manipulating the data to include one or more additional predetermined m-tuples; B. determining a set of one or more prohibited m-tuples that combine with respective m-tuples of the data and the predetermined m-tuples to produce m-tuples that do not meet modulation requirements; and C. selecting one or more m-tuples that are not prohibited m-tuples and combining the selected one or more m-tuples with the data and the predetermined m-tuples to produce a sequence of m-tuples that meet the modulation requirements.
2. The method of claim 1 wherein the step of selecting includes selecting m-tuples that combine with one or more fixed sequences of m-tuples to produce selected m-tuples that combine with the data and predetermined m-tuples.
3. The method of claim 2 wherein the step of manipulating the data includes manipulating the data in accordance with a shortened distance d (n,k) Reed Solomon code to produce a data code word that includes n-k=r m-bit redundancy symbols and appending to the data code word the one or more predetermined m-tuples, the step of determining includes determining one or more fixed-symbol code words F_(i) with selected non-zero symbols, and the step of selecting includes selecting for each fixed-symbol code word a corresponding m-tuple α^(pi) such that the fixed symbols are multiplied by the corresponding α^(pi), and the results are combined with the data code word and the appended predetermined m-tuples to produce a corresponding transmission code word in which the respective symbols meet the modulation requirements.
4. The method of claim 3 wherein the determining step further includes determining the number of fixed-symbol code words based on the code length and the modulation requirements, and the manipulation step further includes appending to the data code word a number of predetermined m-tuples that corresponds to the number of fixed-symbol code words.
5. The method of claim 4 wherein the predetermined m-tuples are all set to the value α⁰ and the corresponding symbols in the transmission code word have values that correspond to the selected α^(pi).
6. The method of claim 4 wherein the step of determining includes determining fixed-symbol code words in which the fixed symbols are combinations of corresponding rows in m×m sub-matrices of the k×n code generator matrix G.
7. The method of claim 6 wherein the step of determining further includes for a non-systematic code manipulating the code generator matrix G by permuting rows with a row permutation matrix P and permuting columns with a column permutation matrix Q to produce a generator matrix G* for an associated systematic code as PGQ and determining the fixed symbols of the fixed-symbol code words as combinations of corresponding rows in m×m sub-matrices of the matrix G*Q⁻¹.
8. A method of encoding data symbols including the steps of: A. manipulating the data symbols in accordance with a shortened distance d (n,k) Reed Solomon code to produce r redundancy symbols and including the data symbols, the redundancy symbols and one or more predetermined appended symbols in a data code word D; B. selecting one or more symbols α^(pi) to combine with one or more fixed-symbol code words F_(i) to produce a code word F=α^(pm)F_(m)+α^(pm-1)F_(m-1)+ . . . +α^(p0)F₀ that when combined with the data code word produces a transmission code word that includes respective symbols that meet modulation requirements; and C. combining the data code word D and the selected code word F to produce a transmission code word C with respective symbols that meet the modulation requirements.
9. The method of claim 8 wherein the step of selecting includes determining a number of fixed-symbol code words required based on the code length and the number of symbols prohibited by the modulation requirements, for all but one of the respective fixed-symbol code words setting to α⁰ L information symbols that combine with L corresponding symbols of the data and setting the remaining information symbols that correspond to the data to all zeros and for the one code word setting to α⁰ L-r information symbols that combine with L-r corresponding symbols of the data and a first redundancy symbol, and setting the remaining information symbols that correspond to the data to all zeros, and selecting for each fixed-symbol code word F_(i) a symbol α^(pi) such that the non-zero symbols of the respective code words combine with the corresponding symbols of the data code word to produce corresponding symbols of the transmission code word.
10. A method of encoding data symbols including the steps of: A. appending a predetermined m-tuple to L m-tuples of data; B. determining a set of one or more prohibited m-tuples that combine with respective m-tuples of the data and the predetermined m-tuple to produce m-tuples that do not meet modulation requirements; and C. selecting one or more m-tuples that are not prohibited m-tuples and combining the selected one or more m-tuples with the L data m-tuples and the predetermined m-tuple to produce L+1 m-tuples that meet the modulation requirements.
11. The method of claim 10 wherein the step of selecting includes selecting a single m-tuple that combines with respective data m-tuples and the predetermined m-tuple.
12. The method of claim 10 wherein the step of selecting includes selecting an m-tuple that combines with a sequence of L+1 fixed m-tuples to produce L+1 selected m-tuples that combine with the data and predetermined m-tuple.